# We don’t have a hundred biases, we have the wrong model

![rw-book-cover](https://readwise-assets.s3.amazonaws.com/static/images/article4.6bc1851654a0.png)

## Metadata

- Author: [[worksinprogress.co]]
- Full Title: We don’t have a hundred biases, we have the wrong model
- Category: #articles
- URL: https://www.worksinprogress.co/issue/biases-the-wrong-model/

## Highlights

- This finding has had implications for research into happiness and hedonic adaptation, and how these in turn affect behavior. If our mind uses a temporal difference (TD) learning algorithm, it is not the level of the outcome that causes the positive feelings associated with success, but the prediction errors arising from exceeding expectations. This leads to a possible explanation for the centrality of reference points in Kahneman and Tversky’s prospect theory, whereby our utility is a function not of absolute levels but of changes. Reference dependence becomes a feature of our learning process rather than a bias or bug. This algorithm also provides a source of hypotheses about what the reference point should be. (A minimal TD sketch follows these highlights.)
- The challenge of reward shaping is that the computer scientist needs to shape the proximate rewards such that the ultimate objective is still achieved. There are many well-known examples of algorithms finding ways to hack the reward structure, maximizing their rewards without achieving the objective their developers intended. For example, one tic-tac-toe algorithm learned to place its move far off the board, winning when its opponent’s memory crashed in response. With this framing, you can see the parallel with evolution and mismatch. Evolution ultimately rewards survival and reproduction, but we don’t receive a reward only at the moment we produce offspring. Evolution has given us proximate objectives that lead to that ultimate outcome, with rewards along the way for doing things that tended (over our evolutionary past) to lead to reproductive success. (A shaping sketch also appears below.)
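The first highlight's claim, that the reward signal is the prediction error rather than the outcome level, can be made concrete in a few lines. This is a minimal sketch of tabular TD(0) learning of my own construction, not from the article; the state names and constants are illustrative. It shows that a repeated, unchanged reward produces a shrinking prediction error as the learned expectation (the reference point) adapts, which is the mechanism behind hedonic adaptation.

```python
# A minimal sketch of tabular TD(0) learning (not from the article).
# The point: the "positive feeling" signal is the prediction error
# delta, not the absolute reward level.

ALPHA = 0.1   # learning rate
GAMMA = 0.9   # discount factor

# Learned expectations (values) for two illustrative states.
value = {"before_outcome": 0.0, "after_outcome": 0.0}

def td_update(state, reward, next_state):
    """One TD(0) step: delta = r + gamma * V(s') - V(s)."""
    delta = reward + GAMMA * value[next_state] - value[state]
    value[state] += ALPHA * delta
    return delta

# Receiving the same reward repeatedly: early on the prediction error is
# large (the reward exceeds expectations); as V(s) adapts, delta shrinks
# toward zero even though the absolute reward never changes.
for episode in range(50):
    delta = td_update("before_outcome", reward=1.0, next_state="after_outcome")
    if episode in (0, 10, 49):
        print(f"episode {episode}: prediction error = {delta:.3f}")
```

Run as-is, the printed prediction error falls from 1.0 toward zero: the same absolute outcome stops generating a positive signal once it is fully expected.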
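The reward-shaping highlight can be sketched the same way. Below is a toy illustration I constructed (not from the article) on a one-dimensional corridor: the ultimate reward pays only at the goal, while the shaped proximate reward pays for progress along the way. The shaping term here is potential-based, F = gamma * phi(s') - phi(s), which is known not to change which policy is optimal (Ng, Harada & Russell, 1999); naive bonuses, by contrast, invite exactly the reward hacking the highlight describes.

```python
# A toy sketch of reward shaping on a 1-D corridor (my own construction).
# The ultimate reward is sparse; the shaping term supplies proximate
# feedback for progress, mirroring evolution's proximate objectives.

GOAL = 5
GAMMA = 0.9

def ultimate_reward(state):
    """Sparse reward: paid only when the ultimate objective is met."""
    return 1.0 if state == GOAL else 0.0

def potential(state):
    """Potential function: closeness to the goal (a design choice)."""
    return -abs(GOAL - state)

def shaped_reward(state, next_state):
    """Total reward = ultimate reward + potential-based shaping term
    F = gamma * phi(s') - phi(s). A naive alternative (e.g., a flat
    bonus for any movement) could be 'hacked' by pacing back and
    forth to farm the bonus without ever reaching the goal."""
    return (ultimate_reward(next_state)
            + GAMMA * potential(next_state) - potential(state))

# Each step toward the goal earns a positive shaped reward, so the
# agent gets proximate feedback long before the ultimate payoff.
state = 0
while state < GOAL:
    print(f"{state} -> {state + 1}: reward = {shaped_reward(state, state + 1):.2f}")
    state += 1
```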